REPORT DOCUMENTATION PAGE

Security Classification: Unclassified

3. Distribution/Availability of Report: Approved for Public Release; Distribution Unlimited
5. Monitoring Organization Report Number(s): AFOSR-TR-90-1100
6a. Name of Performing Organization: Utah State University
6c. Address (City, State and ZIP Code): Logan, UT 84322-8200
7a. Name of Monitoring Organization: AFOSR/NL
7b. Address (City, State and ZIP Code): Bolling AFB, DC 20332-6448
9. Procurement Instrument Identification Number: Grant AFOSR-89-0509
12. Personal Author(s): Doucette, W.J.; Stevens, D.K.; Dupont, R.R.; McLean, J.E.; Denne, D.; Holt, M.
13a. Type of Report: Annual
14. Date of Report (Yr, Mo, Day): 1990/9/14
18. Subject Terms: Pollution, QSARs, Expert System, Organic Chemical, Property Estimation

19. Abstract

A microcomputer-based Property Estimation Program (PEP) and Database (DB), utilizing
molecular connectivity indices (MCI)-property and property-property correlations, as well
as UNIFAC-derived activity coefficients, has been designed to provide both experts and
non-experts with a fast, economical method to estimate a compound's aqueous solubility,
octanol/water partition coefficient, vapor pressure, organic carbon normalized soil sorption
coefficient, BCF, and Henry's Law constant for use in environmental fate modeling. The
user can input the required structural information using either Simplified Molecular Input
Line Entry System (SMILES) notation or connection tables generated from two commercially
available two-dimensional drawing programs, ChemDraw or Chemintosh. Estimates of
predictor accuracy are provided along with the estimated property values. The development
and current status of the PEP-DB program are described.




ENVIRONMENTAL CONTAMINANT PROPERTY ESTIMATION USING QSARs IN
AN EXPERT SYSTEM


William  J.  Doucette 
David  K.  Stevens 
R.  Ryan  Dupont 
Joan  E.  McLean 
Doug  Denne 
Mark  Holt 

Utah  State  University 

Utah  Water  Research  Laboratory 

Logan,  UT  84322-8200 

September  14,  1990 

Annual  Report  for  Period  15  August  1989  -  15  August  1990 


Prepared  for 

U.S. AIR FORCE OFFICE OF SCIENTIFIC RESEARCH
Bolling AFB, DC 20332-6448


TABLE OF CONTENTS

Executive Summary
Objectives or Statement of Work - WJD
Background and Significance
Nature of the Problem
Quantitative Structure-Activity Relationships (QSARs)
Correlations Between the Property of Interest and Another More Easily Obtained Property
Fragment Constant Methods
Correlations Between the Property of Interest and Topological Indexes
Theoretically Derived Equations
Problems Associated with the Estimation Methods (or the Need for a Decision Support System)
Status of Research Effort
Introduction
Overview of PEP and Chemical Property Database
Computer Hardware/Software Requirements and Development Tools
PEP Software Overview
PEP Software Components
HyperTalk Scripts
HyperCard External Commands and Functions
External Applications
PEP Software Tools
HyperCard
Think C
XTRA
ResEdit
Progress XCMD
System Requirements
Chemical Property Database
Description
MCI Based Property Estimation Module
Overview
Calculation of MCIs
Development of MCI-Property Relationships
Statistical Evaluation of MCI-Property Relationships
Examination of residuals
Analysis of variance
Student's t-test for the significance of variables
Precision of the predicted value
Preliminary Results from MCI-Property Relationships
UNIFAC Module
Overview
Calculation of UNIFAC derived activity coefficients
Estimation of aqueous solubility and octanol/water partition coefficients from UNIFAC derived activity coefficients
Property/property correlation module
Summary of First Year Accomplishments
Second Year Objectives
Miscellaneous Publications
List of Papers Presented at Professional Meetings
List of Graduate Students Associated with the Research Effort
References
Appendix A
Appendix B
Appendix C



LIST OF TABLES

Table

1  Current number of components in chemical property database listed by property
2  Sample ANOVA table

LIST OF FIGURES

Figure

1  Flow chart overview of PEP-DB
2  View of PEP's chemical property database
3  Flow chart depicting operation of MCI module
4  Delta values calculated for phenol
5  Four types of graph fragments
6  Examples of residual plots
7  Experimental versus estimated (MCI Universal) log Kow
8  Experimental versus estimated (MCI four general equations) log Kow
9  Experimental versus estimated (ClogP) log Kow
10  Flow chart depicting operation of UNIFAC module
11  Flow chart depicting operation of UNIFAC module


EXECUTIVE  SUMMARY 


In  order  to  assess  the  potential  impact  of  the  accidental  introduction  of  an  organic  chemical 
into  the  environment,  information  is  needed  concerning  its  environmental  fate.  The  fate  of  an 
organic  chemical  in  the  environment  depends  on  a  variety  of  physical,  chemical  and  biological 
processes.  Mathematical  models,  which  attempt  to  integrate  these  processes,  are  widely  used 
to  predict  the  transport  and  distribution  of  organic  contaminants  in  the  environment.  Use  of 
these  models  requires  a  variety  of  input  parameters  which  describe  site  and  contaminant 
physical-chemical  and  biological  characteristics.  Several  important  contaminant  properties  used 
to assess the mobility and persistence of a chemical are aqueous solubility, octanol/water
partition  coefficient,  soil/water  sorption  coefficient,  Henry's  Law  constant,  bioconcentration 
factor,  and  transformation  rates  for  biodegradation,  photolysis  and  hydrolysis. 

One  major  limitation  to  the  use  of  environmental  fate  models  has  been  the  lack  of  suitable 
values  for  many  of  these  properties.  The  scarcity  of  data,  due  mainly  to  the  difficulty  and  cost 
involved  in  experimental  determination  of  such  properties,  has  resulted  in  an  increased  reliance 
on  the  use  of  estimated  values  for  many  applications. 

Quantitative  Structure-Property  Relationships  (QSPRs)  and  Quantitative  Property-Property 
Relationships  (QPPRs)  are  methods  by  which  properties  of  a  chemical  can  be  estimated  from  a 
knowledge  of  the  structure  of  a  molecule  or  from  another  more  easily  obtained  property. 
Selection  and  application  of  the  most  appropriate  QSPRs  or  QPPRs  for  a  given  compound  is 
based  on  several  factors  including:  the  availability  of  required  input,  the  methodology  for 
calculating  the  necessary  topological  information,  the  appropriateness  of  a  correlation  to  the 
chemical  of  interest,  and  an  understanding  of  the  mechanisms  controlling  the  property  being 
estimated. 

A microcomputer-based Property Estimation Program and Database (PEP-DB), utilizing
molecular connectivity indices (MCI)-property and property-property correlations and
UNIFAC-derived activity coefficients, is being developed to provide both experts and
non-experts with a fast, economical method to estimate a compound's aqueous solubility,
octanol/water partition coefficient, vapor pressure, organic carbon normalized soil sorption
coefficient (Koc), bioconcentration factor (BCF), and Henry's Law constant for use in
environmental fate modeling. The user can input the required structural information using
either Simplified Molecular Input Line Entry System (SMILES) notation or connection tables
generated from two commercially available two-dimensional drawing programs, ChemDraw™
or Chemintosh™. Estimates of predictor accuracy are provided along with the estimated
property values. This report describes the development and current status of PEP-DB.

OBJECTIVES OR STATEMENT OF WORK-WJD


The  primary  goal  of  this  project  is  to  develop  a  microcomputer-based  expert  system 
utilizing  Quantitative  Structure  Activity  Relationships  (QSARs)  to  predict  the  physical-chemical 
properties  of  an  organic  chemical  which  are  necessary  to  model  its  environmental  fate.  The 
specific  properties  that  are  being  investigated  include:  aqueous  solubility  (S),  vapor  pressure 
(Vp),  organic  carbon  normalized  soil/water  partition  coefficient  (Koc),  Henry's  Law  constant 
(H),  and  bioconcentration  factor  (BCF). 

In  order  to  achieve  the  primary  goal  of  this  research,  the  following  specific  objectives  are 
being  accomplished: 

1. To compile an accurate database of experimentally determined values of aqueous solubility,
vapor  pressure,  soil/water  partition  coefficient,  Henry’s  Law  constant,  and 
bioconcentration  and  bioaccumulation  factors  for  a  wide  variety  of  organic  compounds. 
The  database  includes  compounds  exhibiting  a  broad  range  of  physical  and  chemical 
properties  and  expected  mobility  and  persistence. 

2.  Using  the  database  developed  in  Objective  1,  evaluate  and  refine  existing  methods  and/or 
develop  new  methods  for  estimating  these  contaminant  properties  using  QSARs. 

3. Develop a microcomputer-based decision support system which incorporates the methods
developed  in  Objective  2,  to  allow  the  prediction  of  environmental  fate  and  transport 
properties  of  an  organic  contaminant  upon  inputting  its  structure.  An  estimate  of  the 
accuracy  of  the  predicted  value  is  also  provided  from  the  decision  support  system. 

4.  Test  the  ability  of  the  decision  support  system  developed  in  Objective  3  to  provide  an 
accurate  estimate  of  these  environmental  fate  and  transport  properties.  This  will  be  done 
using  a  test  set  of  chemicals  of  interest  to  the  USAF  (solvents,  fuels,  pesticides)  for  which 
accurate  experimental  values  are  available. 

BACKGROUND AND SIGNIFICANCE

Nature  of  the  problem 

In  order  to  assess  the  potential  impact  that  the  introduction  of  an  organic  chemical  into  the 
environment  will  have,  information  is  needed  concerning  its  environmental  fate. 
Environmental  fate  encompasses  the  transport  and  degradation  processes  which  determine  the 
behavior  of  a  chemical  released  into  the  environment.  The  fate  of  an  organic  chemical 
introduced  into  the  environment  depends  on  a  variety  of  physical,  chemical  and  biological 
processes.  Mathematical  models,  which  attempt  to  integrate  these  processes,  are  widely  used 
to  predict  the  environmental  transport  and  distribution  of  organic  contaminants.  Use  of  these 
models  requires  a  variety  of  input  parameters  concerning  site  and  contaminant  physical- 
chemical  and  biological  characteristics.  Several  important  contaminant  properties  used  to  assess 
the  mobility  and  persistence  of  a  chemical  are  listed  below: 

Mobility                                        Persistence

Henry's Law constant                            Biodegradation Rate
  (or vapor pressure and aqueous solubility)    Photolysis Rate
Bioconcentration factor                         Hydrolysis Rate
Soil/water partition coefficients               Oxidation Rate

One  major  limitation  to  the  use  of  such  models  has  been  the  lack  of  suitable  values  for 
many  of  the  properties  listed  above.  The  scarcity  of  data,  due  mainly  to  the  difficulty  and  cost 
involved  in  experimental  determination  of  such  properties,  has  resulted  in  an  increased  reliance 
on  the  use  of  estimated  values  for  many  applications. 

Quantitative Structure-Activity Relationships (QSARs)

Quantitative  Structure-Activity  Relationships  (QSARs)  are  sources  of  such  data  that  are 
increasingly  recognized  as  rapid,  practical,  and  inexpensive  methods  with  which  to  estimate 
values  of  some  constants  or  properties  necessary  for  fate  assessment  models.  QSARs  are 
methods by which data or information on the properties of a chemical can be inferred or
calculated from a knowledge of the structure of a molecule or from another more easily
obtained property without a specific concern for molecular structure.


Most QSAR methods currently used to estimate contaminant properties fall into one of the
following categories (Lyman, 1985):

1. Correlations between the property of interest and another more easily obtained property.

2. Correlations between the property of interest and various topological indexes.

3. Calculation of the property of interest using fragment or group contribution methods.

4.  Theoretical  equations,  generally  containing  parameters  that  are  experimentally  or 
empirically  derived. 

Correlations  between  the  property  of  interest  and  another  more  easily  obtained  property: 

One of the most useful and widely used types of estimation method is a simple linear
regression between two properties. Frequently this regression is expressed in terms of the
logarithms of the two properties. Researchers have found that a number of environmental
properties can be related to one another in this manner. For example, the octanol/water
partition coefficient (Kow) has been used to estimate soil sorption coefficients
(Karickhoff, 1979), aqueous solubility (Chiou et al., 1977 and Mackay et al., 1980),
bioconcentration factors (Neely et al., 1987; Chiou et al., 1977), and aquatic toxicity
(Koneman, 1980).

One important limitation of this approach is that, in many cases, values for the
property used to estimate the property of interest are also not available. In addition, when
using this approach it is essential to evaluate the data used to generate the correlation
expression. In many instances the correlation expression was derived using only one
chemical class, a narrow range of property values, poor quality data, or estimated property
values in the regression analysis. It is also important not to use a regression equation outside
of the range of data from which it was derived.
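
The property-property approach described above can be sketched in a few lines. The following is a minimal illustration, not the PEP implementation: it fits a least-squares line to log-transformed values of two properties and refuses to extrapolate outside the range of the fitted data. The class name, function names, and the numbers in the usage lines are hypothetical placeholders, not measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


class LogLogCorrelation:
    """Linear correlation between the logs of two properties (hypothetical helper)."""

    def __init__(self, log_predictor, log_property):
        self.slope, self.intercept = fit_line(log_predictor, log_property)
        # Remember the range of the data the correlation was derived from.
        self.low, self.high = min(log_predictor), max(log_predictor)

    def estimate(self, log_value):
        # Do not apply the regression outside the range of its training data.
        if not self.low <= log_value <= self.high:
            raise ValueError("input lies outside the range used to derive the correlation")
        return self.slope * log_value + self.intercept


# Illustrative (made-up) log values only; real use requires evaluated experimental data.
corr = LogLogCorrelation([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 3.0])
print(corr.slope, corr.intercept)
print(corr.estimate(2.5))
```

Storing the range of the predictor alongside the coefficients is one simple way to enforce the caution about extrapolation; checks on chemical class and data quality would still have to be made separately.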

Fragment  constant  methods 

These methods generally assume that a single numerical value, referred to as a fragment
constant, will represent the contribution of a specified atom, fragment (a group of atoms
bonded together), or structural factor to the property of interest. Probably the most widely
used fragment constant method is that developed by Hansch and Leo (1979) for estimating
octanol/water partition coefficients (Kow). Using a large database of measured values of Kow,
fragment constants have been developed for over 160 atoms or fragments and for a variety of
structural factors (double and triple bonds, aromatic rings, etc.). These fragment constants
and structural factors are used to estimate a value of log Kow for a particular chemical using the
following expression:


$$\log K_{ow} = \sum (\text{fragment values}) + \sum (\text{factor values}) \qquad (1)$$
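
As a minimal sketch of how Equation 1 is applied, the function below sums tabulated fragment and factor values weighted by how often each occurs in the molecule. The function name and the fragment and factor values in the usage lines are hypothetical placeholders chosen only to show the arithmetic; they are not Hansch and Leo constants.

```python
def estimate_log_kow(fragment_counts, factor_counts, fragment_values, factor_values):
    """Sum of (count * fragment value) plus sum of (count * factor value).

    A missing table entry raises KeyError, which mirrors the case where no
    constant is available for a fragment or structural factor.
    """
    fragment_sum = sum(n * fragment_values[name] for name, n in fragment_counts.items())
    factor_sum = sum(n * factor_values[name] for name, n in factor_counts.items())
    return fragment_sum + factor_sum


# Hypothetical lookup tables (placeholder numbers, not published constants).
fragment_values = {"A": 0.5, "B": -1.0}
factor_values = {"F": 0.25}

# A molecule containing two A fragments, one B fragment, and two occurrences of factor F.
print(estimate_log_kow({"A": 2, "B": 1}, {"F": 2}, fragment_values, factor_values))
```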


Another example of the fragment constant approach to predicting properties is the
UNIFAC (UNIQUAC Functional Group Activity Coefficient) solution-of-groups method of
calculating activity coefficients. The UNIFAC method was developed to estimate activity
coefficients in mixtures of nonelectrolytes (Fredenslund et al., 1977). In this technique, the
activity coefficient is divided into two parts: a combinatorial part, which reflects the size and
shape of the molecules present, and a residual part, which depends on functional group
interactions. Various parameters, such as van der Waals group volumes and surface areas and
group interaction parameters, are input into a series of equations from which the combinatorial
and residual parts are calculated. Values for the group parameters have been tabulated and can
be found in the literature (Fredenslund et al., 1977 and Gmehling, 1982). Lyman et al.
(1982) give several examples illustrating the use of this technique.
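
To make the two-part split concrete, the sketch below computes only the combinatorial part of ln(gamma) in its commonly published UNIFAC form, given mole fractions x and molecular volume and surface parameters r and q (each obtained by summing the tabulated group volume and surface values over the groups in a molecule). The residual part, which requires the group interaction parameters, is omitted. The function name is an assumption made for illustration, and this is not the code used in PEP.

```python
import math

Z = 10.0  # lattice coordination number conventionally used in UNIFAC


def ln_gamma_combinatorial(x, r, q):
    """Combinatorial contribution to ln(gamma) for every component.

    x: mole fractions; r, q: molecular volume and surface parameters.
    """
    sum_r = sum(xi * ri for xi, ri in zip(x, r))
    sum_q = sum(xi * qi for xi, qi in zip(x, q))
    l = [Z / 2.0 * (ri - qi) - (ri - 1.0) for ri, qi in zip(r, q)]
    sum_l = sum(xi * li for xi, li in zip(x, l))
    result = []
    for ri, qi, li in zip(r, q, l):
        phi_over_x = ri / sum_r    # volume fraction divided by mole fraction
        theta_over_x = qi / sum_q  # surface fraction divided by mole fraction
        result.append(
            math.log(phi_over_x)
            + Z / 2.0 * qi * math.log(theta_over_x / phi_over_x)
            + li
            - phi_over_x * sum_l
        )
    return result


# Two components with identical size and shape parameters mix ideally in this term.
print(ln_gamma_combinatorial([0.3, 0.7], [2.0, 2.0], [1.8, 1.8]))
```

Working with the ratios of volume and surface fractions to mole fraction keeps the expression defined at infinite dilution, where the mole fraction of the solute approaches zero.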

The UNIFAC method was used by Arbuckle (1983) to calculate the activity coefficients for
21 organic compounds. Solubility values were then calculated from the UNIFAC-derived
activity coefficients and compared to experimental values. The calculated solubility values
were generally lower than the experimental values, and the largest errors were generally
associated with the least soluble compounds.

Lyman (1985) summarized the limitations associated with the use of fragment constant
methods as follows: most fragment constants are derived from compounds with no more than one
functional group; most methods are limited in the number of fragments and factors they cover,
meaning that for certain types of compounds the fragments and factors are unavailable;
application to structurally complex molecules can be difficult; many make no distinction
between positional isomers of a compound; and they may be difficult to use.

Correlations  Between  the  Property  of  Interest  and  Topological  Indexes 

Topological indexes attempt to translate molecular structures into unique characteristic
structural descriptors that can be expressed numerically. One of the most widely used
topological indexes is the molecular connectivity index (MCI) developed by Randic (1972) and
refined and expanded by Kier and Hall (1976, 1980, 1986). Molecular connectivity is a
method of bond counting from which topological indexes, based on the structure of the
compound, can be derived. For a given molecular structure, several types and orders of
molecular connectivity indexes (MCIs) can be calculated. Information on the molecular size,
branching, cyclization, unsaturation, and heteroatom content of a molecule is encoded in these
various indices (Kier and Hall, 1976). Molecular connectivity has been utilized to predict Koc
(Sabljic, 1984; Sabljic, 1987b), S (Kenaga and Goring, 1980), and other physicochemical
properties of chemical compounds such as Kow (Doucette and Andren, 1988) and
bioconcentration factors (Briggs, 1981). The advantage of using MCIs to predict physical-
chemical properties is that once the correlation has been developed only the structure of the
chemical of interest is required as input. No additional experimental parameters are needed.
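
As an illustration of the bond-counting idea, the sketch below computes the simple (non-valence) first-order connectivity index in its standard form: each non-hydrogen atom is assigned a delta value equal to its number of non-hydrogen neighbors, and the index is the sum over bonds of the reciprocal square root of the product of the two delta values. The function name and the bond-list input format are assumptions made for this example; this is not the EstimateMCI code, and valence and higher-order indexes are not covered.

```python
import math


def first_order_chi(bonds):
    """Simple first-order connectivity index of a hydrogen-suppressed graph.

    bonds: list of (atom_i, atom_j) pairs, one per bond between non-hydrogen atoms.
    """
    delta = {}
    for a, b in bonds:
        # delta = number of non-hydrogen neighbors of each atom
        delta[a] = delta.get(a, 0) + 1
        delta[b] = delta.get(b, 0) + 1
    return sum(1.0 / math.sqrt(delta[a] * delta[b]) for a, b in bonds)


n_butane = [(1, 2), (2, 3), (3, 4)]   # C-C-C-C
isobutane = [(1, 2), (2, 3), (2, 4)]  # central carbon bonded to three methyl groups

print(round(first_order_chi(n_butane), 4))   # 1.9142
print(round(first_order_chi(isobutane), 4))  # 1.7321
```

The two isomers have the same number of atoms and bonds but different index values, which shows how branching information is encoded.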

Although researchers have been successful in using MCIs to estimate properties for a variety
of chemicals, the problem of class-specific correlations still remains. For example, Gerstl and
Helling (1987) evaluated the use of MCIs in estimating log Koc, log Kow and water solubility
for many types of pesticides and non-pesticides. It was found that, while good predictions of
sorption coefficients were possible for specific groups of compounds, the ability of any one
equation to predict log Koc, based upon one or two MCIs, was rather low for diverse
compound types. In addition, calculation of MCIs can be difficult, especially the higher-order
indices for complex molecules.




Theoretically  Derived  Equations 

Two  examples  of  using  theoretical  equations  to  estimate  properties  of  environmental 
interest  include  the  estimation  of  a  compound's  vapor  pressure  from  its  boiling  point  and  the 
calculation  of  Henry's  Law  constant  from  the  ratio  of  a  compound's  vapor  pressure  to  its 
aqueous  solubility. 

Using  a  72  compound  test  set  of  hydrocarbons  and  halocarbons,  Mackay  et  al.  (1982) 
developed  an  expression  which  enables  the  estimation  of  a  compound's  vapor  pressure  from  its 
boiling  point.  This  equation  was  derived,  in  part,  from  the  Clausius-Clapeyron  equation,  itself 
derived  from  the  second  law  of  thermodynamics.  Mackay  et  al.  noted  that  this  expression  may 
not  be  applicable  to  other  classes  of  compounds  and  that  method  errors  increase  as  vapor 
pressure  decreases. 

Another widely used example of a theoretically derived method is the calculation of
Henry's Law constants, H, from a compound's vapor pressure/aqueous solubility ratio. H is
defined as the ratio of a chemical's concentration in air to its concentration in water when those
two phases are in contact and at equilibrium. The derivation of this expression requires the
assumption that liquid phase activity coefficients are constant up to the aqueous solubility limit.
Thus, the method is not applicable to compounds with high water solubilities.
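
The vapor pressure/solubility ratio can be written out as a short calculation. The sketch below is illustrative only: the function names, the choice of SI units, and the input values in the usage lines are assumptions, and the result is subject to the applicability limit noted above.

```python
R_GAS = 8.314  # gas constant, J/(mol K), equivalent to Pa m^3/(mol K)


def henrys_constant(vapor_pressure_pa, solubility_mol_m3):
    """H = vapor pressure / aqueous solubility, in Pa m^3/mol."""
    if solubility_mol_m3 <= 0:
        raise ValueError("solubility must be positive")
    return vapor_pressure_pa / solubility_mol_m3


def dimensionless_henrys_constant(h_pa_m3_mol, temperature_k=298.15):
    """Air/water concentration ratio, obtained by dividing H by RT."""
    return h_pa_m3_mol / (R_GAS * temperature_k)


# Placeholder inputs, not data for any particular compound.
h = henrys_constant(1000.0, 10.0)
print(h)                                 # 100.0
print(dimensionless_henrys_constant(h))
```

Both inputs must refer to the same temperature and the same physical state of the pure compound for the ratio to be meaningful.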

Problems  Associated  with  the  Estimation  Methods  (or  the  need  for  a  decision  support  system) 

In most cases, more than one estimation method is available for a particular input
parameter. Estimation methods, however, have widely varying accuracies, and indiscriminate
use of these techniques can result in large errors.

Selection and application of QSAR methods requires varying degrees of expertise that
depend on the structure of a particular chemical of interest, knowledge of the mechanism of the
process, the extent of the database used to develop the QSAR, and the complexity of the
structural analysis required to relate structure to the property. For example, some QSARs are
broader than others in the range of chemicals that are covered, and some methods have been
established with a better understanding of the mechanisms or properties involved. In many
cases estimation methods are developed from empirical or semiempirical correlations. The
success of the correlation is dependent on many factors including the type and number of
compounds used in its development.

Incorporation  of  QSARs  into  a  computer  format  is  a  logical  and  necessary  step  to  gain  full 
advantage  of  the  methodologies  for  simplifying  fate  assessment.  A  practical  computerized 
property  estimation  program,  utilizing  QSARs,  should  include  the  following  attributes:  be 
simple  and  flexible  to  use  for  both  experts  and  non-experts,  include  sufficient  statistical 
information regarding the development of the QSARs so that the range of applicability of
such  models  can  be  evaluated,  and  provide  an  indication  of  the  accuracy  of  the  estimated 
property.  A  microcomputer-based  system  for  the  estimation  of  parameters  necessary  for  fate 
assessment  models  would  be  of  great  benefit  to  USAF  agencies  responsible  for  environmental 
fate  assessment. 

STATUS  OF  RESEARCH  EFFORT 

Introduction 

After evaluating a variety of software approaches, including several expert system shells,
we decided to build the property estimation system using Apple HyperCard™ software. This
approach  will  enable  us  to  efficiently  build  a  flexible,  simple-to-use  interface  between  the 
various  modules  or  subroutines  of  our  property  estimation  system.  This  approach  will  also 
permit  additional  property  estimation  routines,  as  they  are  developed,  to  be  easily  added. 

The following sections describe the development and use of the HyperCard™-based
Property Estimation Program, PEP, and its associated chemical property Database (DB).

Overview of PEP and Chemical Property Database

PEP  is  currently  comprised  of  three  property  estimation  modules  linked  to  a  chemical 
property  database.  The  three  property  estimation  modules  utilize  MCI-property  relationships, 
UNIFAC-derived  activity  coefficients,  and  property-property  correlations  for  compound 
property estimates. The modular organization of PEP is illustrated in Figure 1.

[Flow chart; legible box label: "View Results and Estimate of Accuracy"]

Figure 1. Flow chart overview of PEP-DB

COMPUTER HARDWARE/SOFTWARE REQUIREMENTS AND DEVELOPMENT TOOLS

PEP Software Overview

The  PEP  software  system  is  a  HyperCard™  based  program  that  runs  on  Apple  Macintosh 
computers.  HyperCard  is  an  information/management  program  included  with  the  purchase  of 
Macintosh  computers.  HyperCard  offers  graphics,  information  storage,  the  means  to  display 
information  in  a  variety  of  formats,  the  ability  to  establish  links  between  related  information,  a 
high  level  language  (HyperTalk),  the  ability  to  extend  HyperTalk  by  writing  new  commands  in 
a  compiled  language,  and  a  mechanism  to  transfer  control  to  other  Macintosh  applications.  The 
PEP  system  uses  all  these  features. 

PEP  Software  Components 

PEP uses a variety of programs to control the user interface, manage the HyperCard
stacks, and perform various computations. These components are HyperTalk scripts, HyperTalk
External Commands (XCMDs), HyperTalk External Functions (XFCNs), and external
application programs.

HyperTalk Scripts

HyperCard™ contains a high-level, interpreted language called HyperTalk, and PEP makes
extensive use of it. A HyperTalk program is called a script. PEP uses scripts to
control the user interface, the stack-to-stack linkage, and the linkages between the cards in each
stack, which is the standard use of HyperTalk. Like most HyperCard™ applications, PEP uses
scripts at all levels in the HyperCard™ hierarchy: the button, field, card, background, and
stack levels in each of the PEP stacks.

Besides  controlling  each  stack,  scripts  also  do  some  of  the  computations  in  the  system. 
For  example,  a  script  is  used  to  compute  estimates  of  the  chemical/physical  properties  in  the 
PEP  processors  as  a  function  of  the  MCIs. 

HyperCard  External  Commands  and  Functions 

PEP  contains  several  external  commands  and  functions.  A  HyperCard™  external 
command  is  an  extension  to  the  HyperTalk  language.  HyperCard™  externals  in  PEP  are 
written in C. External commands and functions are used for several reasons: to
perform operations not supported by HyperTalk, to improve the speed of some computations,
and to improve the structure of a software module.

External  Applications 

External  applications  are  external  computer  programs  that  can  be  run  independent  of 
HyperCard™.  HyperCard™  provides  a  means  (the  open  command)  to  transfer  control  to 
another  application.  When  the  application  ends,  control  returns  to  HyperCard™.  PEP  uses 
three  applications: 

1. ChemDraw™, by Cambridge Scientific Computing, Cambridge, Mass.

2. Chemintosh™, by SoftShell International Ltd, Grand Junction, CO

3. EstimateMCI, by Utah Water Research Laboratory, Logan, Utah

ChemDraw™ and Chemintosh™ are commercial applications. They are used by the PEP
processor stack to provide a means for the user to create a connection table for subsequent
input to PEP. These programs are not distributed with PEP. EstimateMCI is an application
designed and coded by the PEP development team. It is a C program and is distributed with
PEP. EstimateMCI accepts a SMILES string or the contents of a connection table file as its
input and computes the MCIs from the SMILES string or the connection table.
It communicates with the PEP processor stack by passing and receiving information through
external files. (See the prologue in the source code file MCI.C for a detailed description of this
interface.)

PEP  Software  Tools 

The PEP software uses five software tools. These are:

1. HyperCard, version 1.2.5, by Apple Computer, Inc.

2. Think C, version 4.0, by Symantec Corporation, Cupertino, California

3. XTRA, the XFCN, XCMD toolkit, by Adrian Freed, Fidcor USA, Louisville, Colorado

4. ResEdit, version 1.2, by Apple Computer, Inc.

5. Progress XCMD, by Jay Hodgdon, 587 Cutwater Lane, Foster City, CA 94404

HyperCard™

This  is  the  foundation  of  the  system.  All  modules  in  the  PEP  system  are  based  on 
HyperCard™. 

Think  C 

All external commands and functions (XCMDs and XFCNs) are written in the language
C and compiled with version 4.0 of Symantec Corporation's Think C compiler. They are
compiled as Macintosh code resources. The application, EstimateMCI, was also written in C
and compiled with the Think C compiler, in this case as a Macintosh application.

XTRA

XTRA  is  a  commercial  product  that  eases  the  burden  of  writing  HyperTalk  external 
commands  and  functions.  This  product  provides  an  interface  between  HyperCard™  and  code 
resources  (external  commands  or  functions)  compiled  with  the  Think  C  compiler.  In 
Macintosh  terminology,  this  interface  is  called  glue.  The  XTRA  program  also  contains  a 
library of useful functions for use by HyperCard™ external commands.

ResEdit

ResEdit is a Macintosh resource editor by Apple Computer, Inc. ResEdit is used to attach
each HyperCard™ external command and function to the appropriate HyperCard™ stack.

Progress XCMD

The Progress XCMD is a shareware product. This XCMD displays a dialog box with a moving
cursor that shows how far along (e.g., percent complete) a time-consuming script is while
its computations are carried out.

System  Requirements 

The  PEP  system  requires  the  following  system  configuration  to  run:  a  Macintosh  Plus, 
Macintosh  SE,  or  Macintosh  II  computer,  with  a  hard  disk;  HyperCard  software;  Macintosh 
system  software  version  5.0  or  greater,  running  under  MultiFinder;  and  a  minimum  of  2 
megabytes  of  memory  (RAM),  with  1000  kBytes  of  memory  allocated  for  HyperCard. 

CHEMICAL  PROPERTY  DATABASE 

Description 

Experimentally determined chemical property data were compiled from a variety of literature
sources and computerized databases. Using this information, a chemical property database
was developed using HyperCard™. This database was used for developing MCI-property and
property-property relationships and is a major component of the overall property estimation
software system being developed. In its current state, the database includes the following
information: compound name and synonyms, CAS number, chemical formula, molecular
weight, boiling point, melting point, aqueous solubility, octanol/water partition coefficient,
vapor pressure, soil/water sorption coefficients, Henry's Law constants, bioconcentration
factors, and appropriate references for each value. The number of compounds currently in the
database for each property is summarized in Table 1.

TABLE 1. Current number of compounds in chemical property database, listed by property

Property name                                           Number of compounds
Aqueous solubility                                      365
Octanol/water partition coefficient                     196
Soil sorption coefficient (organic carbon normalized)   171
Vapor pressure                                          95
Henry's Law constant                                    76
Bioconcentration factor                                 70

The Chemical Property Database also lets the user search for chemical compounds, sort the
compounds by name, boiling point, melting point, or molecular weight, and transfer to any of
the PEP modules. The chemical property database screen is illustrated in Figure 2.

Figure 2. View of PEP's chemical property database

MCI-BASED PROPERTY ESTIMATION MODULE

Overview 

Upon entering the MCI module, the user must first input the necessary structural
information, using either SMILES (Simplified Molecular Input Line Entry System) notation or
connection tables generated by one of two commercially available two-dimensional drawing
programs, ChemDraw™ or ChemIntosh™. A detailed description of SMILES will be incorporated
as a help option in the near future. After the structural information is entered, the MCIs are
calculated using an application external to HyperCard™. The calculation of MCIs is described
in detail in the following section. The results are then imported back into HyperCard™, where
they can be displayed. Once the MCIs are imported, the user chooses which properties are to
be estimated. Several MCI-property regression models are available for each property, and a
view statistics option is available to aid the user in choosing the most appropriate model.
After a regression model is chosen, estimates for the selected properties can be made. The
MCI module results window provides an estimate of the property along with its accuracy,
based on the 95% confidence interval calculated from the regression. The overall operation
of the MCI module is illustrated in Figure 3.

Calculation  of  MCIs 

Figure 3. Flow chart depicting operation of MCI module

To calculate the MCIs for a given compound, a delta (δ) value must first be assigned to
each atom in the structure. Three main δ values were computed in this study: normal, bond,
and valence. Normal deltas were computed by counting the bonds connected to the atom whose
delta is being calculated, with single, double, and higher-order bonds each counted as one
bond. The bond deltas were calculated the same way, except that each bond was taken at its
face value (single is one, double is two, etc.) instead of being counted as one. Valence
deltas for each atom were computed according to equations (2) and (3) (Kier and Hall, 1986):

$$\delta^v = Z^v - h \tag{2}$$

$$\delta^v = \frac{Z^v - h}{Z - Z^v} \tag{3}$$

where $\delta^v$ is the valence delta, $Z^v$ is the number of valence electrons in the atom, h is the
number of hydrogen atoms bound to the atom, and Z is the atomic number of the atom.
Equation (2) is used for those atoms in the first row of the periodic chart, and equation (3) is
used for all other atoms. An example delta calculation for phenol is shown in Figure 4.
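The three delta definitions can be illustrated with a short Python sketch. This is not the EstimateMCI code: the atom numbering, the Kekulé bond assignment for the ring, and the names `ATOMS`, `BONDS`, and `deltas` are assumptions made for illustration, and only equation (2) is used because phenol contains only first-row atoms.

```python
# Hypothetical hydrogen-suppressed graph for phenol (C6H5OH).
# Each atom maps to (element, number of attached hydrogens); numbering is arbitrary.
ATOMS = {
    1: ("C", 0),  # ring carbon bearing the OH group
    2: ("C", 1), 3: ("C", 1), 4: ("C", 1), 5: ("C", 1), 6: ("C", 1),
    7: ("O", 1),  # hydroxyl oxygen
}
# Bonds as (atom, atom, bond order); the ring is written in one Kekule form (assumption).
BONDS = [(1, 2, 2), (2, 3, 1), (3, 4, 2), (4, 5, 1), (5, 6, 2), (6, 1, 1), (1, 7, 1)]

# Z^v, the number of valence electrons, for the first-row atoms used here.
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6}


def deltas(atoms, bonds):
    """Return (normal, bond, valence) delta dictionaries keyed by atom number."""
    normal = {a: 0 for a in atoms}
    bond = {a: 0 for a in atoms}
    for a, b, order in bonds:
        normal[a] += 1        # every bond counts as one
        normal[b] += 1
        bond[a] += order      # bonds counted at face value
        bond[b] += order
    # Equation (2): valence delta = Z^v - h (first-row atoms only in this sketch).
    valence = {a: VALENCE_ELECTRONS[el] - h for a, (el, h) in atoms.items()}
    return normal, bond, valence


normal, bond, valence = deltas(ATOMS, BONDS)
for atom in sorted(ATOMS):
    print(atom, ATOMS[atom][0], normal[atom], bond[atom], valence[atom])
```

Atoms outside the first row would need equation (3), which this sketch deliberately leaves out.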
Figure 4. Delta values calculated for phenol.

Once the delta values have been calculated for each atom in the molecule, simple, bond, and
valence indices of different orders and types can be calculated. The order refers to the
number of bonds in the skeletal substructure or fragment used in computing the index: zero
order uses individual atoms, first order uses individual bonds, second order uses pairs of
adjacent bonds, and so on. The type refers to the structural fragment (path, cluster,
path/cluster, or chain) used in computing the index. The fragment types are derived from
graph theory and are best described by example, as shown in Figure 5.

[Figure 5 panels: 3rd order path (C-C-C-C); 3rd order cluster; 4th order path-cluster; 3rd order chain]

Figure 5. Four types of graph fragments.

Only path indices are possible for orders less than 3. The symbol $^2\chi$ represents a simple
second order index, whereas the symbol $^1\chi^v$ represents a first order valence index.

Finally, to calculate the MCIs, the following equation is used (Kier, 1980):

$$^m\chi_g^t = \sum_{j=1}^{n} \prod_{i} \left(\delta_i^t\right)^{-1/2} \tag{4}$$

where $\delta_i^t$ is the delta of type t, determined as above, for atom i of the j-th fragment; n is the
total possible number of m-th order fragments in the molecule; m is the number of bonds over
which the deltas are taken; t is the type of index (normal, bond, or valence); and g is one of
the four graph types (path, cluster, path-cluster, or chain).
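As a minimal sketch of equation (4) for the path fragment type only, the following Python code enumerates every path with m bonds in a hydrogen-suppressed graph and sums the products of the inverse square roots of the atom deltas. The n-butane skeleton, the names `BUTANE`, `DELTA`, `paths_of_order`, and `path_index`, and the brute-force enumeration are illustrative assumptions, not the algorithm used in EstimateMCI.

```python
from math import prod

# Hypothetical example: carbon skeleton of n-butane, C1-C2-C3-C4.
BUTANE = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}   # adjacency list
DELTA = {1: 1, 2: 2, 3: 2, 4: 1}                  # normal (simple) deltas


def paths_of_order(adjacency, m):
    """Return every simple path containing m bonds exactly once, as atom tuples."""
    found = set()

    def extend(path):
        if len(path) == m + 1:
            # A path and its reverse are the same fragment; keep one canonical copy.
            found.add(min(path, tuple(reversed(path))))
            return
        for nxt in adjacency[path[-1]]:
            if nxt not in path:
                extend(path + (nxt,))

    for start in adjacency:
        extend((start,))
    return sorted(found)


def path_index(adjacency, delta, m):
    """Order-m path index: sum over paths of the product of delta**-0.5."""
    return sum(prod(delta[a] ** -0.5 for a in p) for p in paths_of_order(adjacency, m))


for order in range(4):
    print(order, round(path_index(BUTANE, DELTA, order), 4))
```

Passing bond or valence deltas in place of `DELTA` gives the corresponding bond or valence path indices; cluster, path-cluster, and chain fragments would need their own enumeration.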
The MCI calculation routine in PEP calculates simple, bond, and valence indices of several
types (path, cluster, chain, and path/cluster) and orders (0 through 6), where possible, for
each molecule, resulting in a maximum of 54 index values for each molecule.
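The report does not state how the 54 values break down. One breakdown consistent with that total, assuming the minimum orders suggested by Figure 5 (path from order 0, cluster and chain from order 3, path/cluster from order 4), can be enumerated as below. The names `DELTA_TYPES`, `MIN_ORDER`, and `index_labels` are hypothetical.

```python
DELTA_TYPES = ("simple", "bond", "valence")
# Assumed lowest order at which each fragment type can occur; the highest order is 6.
MIN_ORDER = {"path": 0, "cluster": 3, "path/cluster": 4, "chain": 3}
MAX_ORDER = 6

index_labels = [
    (delta_type, fragment, order)
    for delta_type in DELTA_TYPES
    for fragment, lowest in MIN_ORDER.items()
    for order in range(lowest, MAX_ORDER + 1)
]
print(len(index_labels))  # 3 delta types x (7 + 4 + 3 + 4) fragment/order pairs = 54
```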

To account for non-dispersive force effects on aqueous solubility and solubility-related
properties, zero- through sixth-order Δ valence path indices (Δχ), as described by Bahnick and
Doucette (1988), are calculated by PEP in addition to the 54 indices described above. To
calculate Δχ indices, a nonpolar equivalent of the molecule is made by substituting C for O or
N atoms. MCIs are calculated for the nonpolar equivalent, and values for Δχ can be computed
for each type of index by:

$$\Delta\chi = (\chi)_{np} - \chi \tag{5}$$
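A worked example of equation (5), sketched in Python for the first order valence path index of ethanol. The choice of ethanol, the function name `first_order_index`, and the assumption that the substituted carbon takes on enough hydrogens to fill its valence (so that OH becomes CH3) are illustrative and are not taken from the report.

```python
def first_order_index(bonds, delta):
    """First order path index: sum over bonds of (delta_a * delta_b) ** -0.5."""
    return sum((delta[a] * delta[b]) ** -0.5 for a, b in bonds)


bonds = [(1, 2), (2, 3)]                 # C-C-O skeleton of ethanol

# Valence deltas from equation (2): CH3 = 4 - 3, CH2 = 4 - 2, OH = 6 - 1.
delta_v = {1: 1, 2: 2, 3: 5}
# Nonpolar equivalent: O replaced by C, giving propane (CH3, CH2, CH3).
delta_v_nonpolar = {1: 1, 2: 2, 3: 1}

chi = first_order_index(bonds, delta_v)
chi_np = first_order_index(bonds, delta_v_nonpolar)
delta_chi = chi_np - chi                 # equation (5)

print(round(chi, 4), round(chi_np, 4), round(delta_chi, 4))
```

A hydrocarbon is its own nonpolar equivalent, so its Δχ is zero; the index is nonzero only when O or N atoms are present.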

DEVELOPMENT OF MCI-PROPERTY RELATIONSHIPS

For each property, MCI-property relationships were developed for both general and
specific chemical classes. The database compounds were classified into four general groups:
non-polar aromatics (compounds having no O- or N-containing functional groups), polar
aromatics (compounds having O- or N-containing functional groups), non-polar non-cyclic
non-aromatics, and polar non-cyclic non-aromatics. They were also classified into specific
chemical classes such as PCBs, PAHs, carbamates, and ureas. In addition, "universal"
equations were developed which utilized all database compounds having values for a specific
property.

Two approaches were used to choose the most appropriate variable(s) in developing the
MCI-property regression equations. The first approach used a combination of two indices, one
related to molecular size or dispersive intermolecular forces (i.e., vp0, vp1, np0, np1, bp0,
and bp1) and one related to the non-dispersive forces (Δχ). The second approach relied
entirely on a stepwise multiple linear regression program to select the most appropriate
variables. If the two approaches resulted in models of similar fit, the equation resulting
from the first approach was used because of its greater conceptual meaning.




The MCI-property relationships which have been developed so far are presented in
Appendix A along with the relevant regression statistics. The information obtained in the first
year of the project shows that MCIs can be used to predict property values for a variety of
organic chemical types using both class-specific and more general regression equations. The
universal and more general regression models utilize Δχ indices. These non-dispersive force
terms are important in predicting physical properties for molecules exhibiting substantial
hydrophilicity. The predicted property values are within the experimental uncertainty in their
measurement for the vast majority of chemicals investigated.

In addition, an improvement in predicted values for some of the compounds in the study
could be realized by adjusting the assigned valence values. This will be further investigated in
the second year of study.

STATISTICAL  EVALUATION  OF  MCI-PROPERTY  RELATIONSHIPS 


All regression equations used in MCI-property relationships (and in the property/property
correlations discussed in a later section) were evaluated by examining their residuals, by
analysis of variance, by testing the statistical significance of the regression variables, and
by estimating the precision of the predicted values. Each of these steps was performed
independently for each regression relationship that was developed from experimental data.

Examination  of  Residuals 

Residuals, $e_i$, are defined mathematically as the difference between the log10 of an
experimentally determined value, $Y_i$, and the log10 of its predicted value, $Y_i'$:

$$e_i = \log_{10} Y_i - \log_{10} Y_i', \quad i = 1, \ldots, n_{obs} \tag{6}$$


Regression analyses are performed subject to a number of important assumptions relative to
the nature of their residuals (Draper and Smith, 1981): 1) they are assumed to be independent
of one another in a data set, 2) they are assumed to have a mean of zero, 3) they are assumed to
have constant variance, and 4) they are assumed to be normally distributed. If the regression
relationship is a true representation of the experimental data and includes all significant
variables, then its residuals should conform to the assumptions made above.


A  test  of  the  validity  of  a  developed  regression  expression  can  then  be  based  on  an 
assessment  of  the  validity  of  these  assumptions  regarding  its  residuals  through  an  examination 
of  residual  plots  (residuals  versus  measured  values,  residuals  versus  predicted  values,  a  normal 
probability  plot  of  residual  values)  and  calculated  statistics. 

Plots of residual values versus measured and predicted values will generally take the form
of the relationships shown in Figure 6. Figure 6a indicates residuals that verify the
assumptions made above, being independent with constant variance and normally distributed
about a zero mean. The relationship shown in Figure 6b indicates either a lack of independence
of the residual values or a regression model that does not adequately represent the observed
data, e.g., a linear model applied to a relationship that has curvature. Figure 6c indicates
non-constant variance, while the residual plot in Figure 6d indicates an error in the
analysis, such as the absence of the y-intercept term in a regression model. The assumption of
normally distributed residuals can also be evaluated from a normal probability plot of residual
values. The residuals can be said to be normally distributed if their normal probability plot
is linear.
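Two of these checks can be sketched numerically in Python. The sketch assumes that a residual is the difference between the log10 of the measured value and the log10 of the predicted value, and it summarizes the linearity of the normal probability plot by a correlation coefficient. The data values, the plotting positions (i - 0.5)/n, and the function names are illustrative assumptions, not part of PEP.

```python
from math import log10
from statistics import NormalDist, mean


def residuals(measured, predicted):
    """Residuals taken as log10(measured) - log10(predicted) (assumed reading)."""
    return [log10(y) - log10(yp) for y, yp in zip(measured, predicted)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def normal_plot_correlation(res):
    """Correlation of ordered residuals with standard normal quantiles.

    Values near 1 correspond to a nearly linear normal probability plot.
    """
    n = len(res)
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return pearson(sorted(res), quantiles)


# Made-up measured and predicted property values, for illustration only.
measured = [12.0, 150.0, 0.8, 33.0, 5.1, 410.0]
predicted = [10.0, 170.0, 1.0, 30.0, 4.0, 500.0]

e = residuals(measured, predicted)
print("mean residual:", round(mean(e), 3))
print("normal-plot correlation:", round(normal_plot_correlation(e), 3))
```

A numeric summary like this complements the residual plots but does not replace them; patterns such as curvature or non-constant variance are still easiest to see graphically.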


Figure 6. Examples of residual plots (panels a through d).


Deviations from the "correct" plot behavior, i.e., Figure 6a, do not automatically mean
that there is an error in the regression relationship or in the assumptions of normality, constant
variance, etc., of the residuals. This is particularly true for regression relationships developed
from a small number of experimental observations. Gross deviations from ideal behavior
should be identified, and should be flags for further testing of the validity of assumptions
regarding the data set being analyzed, as described by Anscombe and Tukey (1963) and Draper
and Smith (1981).

Analysis  of  variance 

When applied to regression analyses, the analysis of variance (ANOVA) is a test for the
significance of the regression relationship, i.e., whether the regression variables are a
significantly better descriptor of the behavior of the data than its mean value. An example
ANOVA table is presented in Table 2. In this table, the significance of the regression
relationship is indicated by the F ratio, which represents the ratio of the variance explained
by the regression to the residual variance. The F value for a given relationship is compared to
a table of critical values for the F distribution for the appropriate number of regression and
residual degrees of freedom. If the F value for the regression is found to be greater than the
critical F value at a given confidence level, the regression is significant at this confidence
level.


Table 2. Sample ANOVA Table

Source                df†    SS††      MS†††         F ratio    α§
Regression            1      6.326     6.326         6.569      <0.01
Residual              22     21.192    0.963 = s²
Total, corrected SS   23     27.518

† degrees of freedom
†† sum of squares
††† mean square = SS/df
§ probability of significance test

Table 2 can be used to calculate the coefficient of determination, r²:

$$r^2 = \frac{SS_{regression}}{SS_{total,\ corrected}} \tag{7}$$

a value which indicates the proportion of the total variation about the mean that is
described by the regression equation. Table 2 data also allow the calculation of the standard
deviation of the regression, s, which is the square root of the residual mean square. This value
of s can be interpreted as the average residual, or average precision of the predicted value
generated from the regression equation.
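The quantities derived from Table 2 can be reproduced with a few lines of Python. The function name `anova_summary` and its dictionary keys are illustrative; the inputs are the sums of squares and degrees of freedom from Table 2.

```python
from math import sqrt


def anova_summary(ss_regression, df_regression, ss_residual, df_residual):
    """Return the F ratio, r-squared, and standard deviation of the regression."""
    ms_regression = ss_regression / df_regression   # mean square = SS/df
    ms_residual = ss_residual / df_residual
    ss_total = ss_regression + ss_residual          # total, corrected SS
    return {
        "F": ms_regression / ms_residual,
        "r2": ss_regression / ss_total,             # equation (7)
        "s": sqrt(ms_residual),                     # square root of residual mean square
    }


table2 = anova_summary(6.326, 1, 21.192, 22)
print({name: round(value, 3) for name, value in table2.items()})
```

The F ratio computed this way is about 6.57, in agreement with the tabulated 6.569 to within the rounding of the residual mean square. It still has to be compared with a critical F value for 1 and 22 degrees of freedom taken from a table.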

Student's  t-test  for  the  significance  of  variables 

The coefficients estimated from the regression relationship are not known exactly, as each
regression coefficient has a corresponding standard error. The Student's t value is the ratio of
a regression coefficient value to its standard error, and indicates whether the coefficient value
is significantly different from zero, i.e., whether it can be considered for inclusion in the
regression equation. The t value for a given coefficient is compared to values in a standard t
table for n - 1 degrees of freedom, where n is the number of observations, to determine the
probability level for the hypothesis that the coefficient is not statistically different from
zero. If the probability from the standard t table is less than a specified probability level,
0.05 in this study, then the variable is identified as significant and is included in the
regression equation.
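As an illustration, the t values can be recomputed from the coefficients and standard errors listed on the "S Universal" statistics card in this report. The critical value of 1.97 used below is an assumed approximation to the two-sided 0.05 level for several hundred degrees of freedom; in practice it would be read from a t table. The variable names are hypothetical.

```python
# (coefficient, standard error) pairs from the "S Universal" statistics card.
coefficients = {
    "Constant": (0.3917, 0.1376),
    "vp1": (-0.9257, 0.0316),
    "delta_vp1": (1.8251, 0.1047),
}

# Assumed approximate two-sided critical t value at the 0.05 level for large df.
T_CRITICAL = 1.97

t_values = {name: coef / se for name, (coef, se) in coefficients.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_values.items()}

for name, t in t_values.items():
    print(name, round(t, 2), "significant" if significant[name] else "not significant")
```

The ratios agree with the t column printed on the card (2.85, -29.3, and 17.4) to the precision shown there.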

Precision of the predicted value

As described above, the standard deviation of the regression relationship, s, is a measure of
the average precision of predicted values. However, as indicated by Draper and Smith (1981),
the precision of the predicted values depends on the values of the independent variables.
These authors indicate that the s value underestimates the uncertainty associated with the
predicted value, and that a better measure of the precision of $Y_0$ at given values of the
independent variables $X_0$ is given by the matrix equation:

$$Y_0 = Y_0' \pm s\sqrt{1 + X_0'(X'X)^{-1}X_0} \tag{8}$$

where $Y_0'$ is the predicted value at $X_0$, with its precision given by the product of s and the
term under the radical; X is the matrix of independent variable values; and $X_0$ is the vector
of independent variables at which predicted values of $Y_0$ are desired. As the number of
observations increases, e.g., $n_{obs}$ > 500, the term under the radical approaches one, and
Equation (8) simplifies to the standard form:

$$Y_0 = Y_0' \pm s \tag{9}$$
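For a regression with one independent variable and an intercept, the matrix term under the radical can be written out by hand, which allows a dependency-free Python sketch. The function name `prediction_halfwidth` and the example data are illustrative assumptions; the multi-variable case would need a general matrix inverse.

```python
from math import sqrt


def prediction_halfwidth(x_values, s, x0):
    """Return s * sqrt(1 + X0'(X'X)^-1 X0) for a one-variable model with intercept.

    Each row of X is [1, x], so X'X = [[n, sum(x)], [sum(x), sum(x^2)]] and the
    quadratic form with X0 = [1, x0] can be expanded from the 2x2 inverse.
    """
    n = len(x_values)
    sum_x = sum(x_values)
    sum_xx = sum(x * x for x in x_values)
    determinant = n * sum_xx - sum_x * sum_x
    quadratic_form = (sum_xx - 2 * x0 * sum_x + n * x0 * x0) / determinant
    return s * sqrt(1 + quadratic_form)


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up index values used to fit the regression
s = 1.0                         # made-up standard deviation of the regression

print(round(prediction_halfwidth(x, s, 3.0), 4))   # at the mean of x
print(round(prediction_halfwidth(x, s, 6.0), 4))   # outside the fitted range
```

The half-width is smallest at the mean of the fitted x values and grows as the prediction point moves away from it, which is why s alone understates the uncertainty for compounds far from the center of the data set.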

Preliminary Results from MCI-Property Relationships

The relationship between estimated and experimental log Kow is illustrated in Figure 7.
The estimated log Kow values were calculated using the universal MCI-log Kow regression
model. A significant improvement is observed (r² rises from 0.7058 to 0.7948) when the four
general class MCI-log Kow regression models (aromatic polar, aromatic non-polar,
non-aromatic non-polar, and non-aromatic polar) are used in combination, as shown in Figure 8.

[Plot: est. log Kow (MCI Universal) versus exp. log Kow; fitted line y = 0.7396x + 0.8987, r² = 0.7058]

Figure 7. Experimental versus estimated (MCI Universal) log Kow.

[Plot: est. log Kow versus exp. log Kow; fitted line with slope 0.8745, r² = 0.7948; intercept not legible]

Figure 8. Experimental versus estimated (MCI four general equations) log Kow.

For comparison, log Kow values estimated using the computerized group contribution method
(ClogP) of Leo and Hansch are plotted against experimental log Kow values for the same
compounds used in Figures 7 and 8 (Figure 9). A comparison between properties estimated by
PEP and those estimated by other established methods will be further explored in the project's
second year.

[Plot: est. log Kow (ClogP) versus exp. log Kow; fitted line y = 0.9625x + 0.2573, r² = 0.8303]

Figure 9. Experimental versus estimated (ClogP) log Kow.

UNIFAC  MODULE 

Overview 

As discussed in the background and significance section, UNIFAC-derived activity
coefficients have been used to estimate values for aqueous solubility and octanol/water partition
coefficients. However, since the UNIFAC approach is a solution-of-groups method, its
operation is different from that of the MCI and property-property modules, which are based on
regression analysis.

Upon entering the UNIFAC module, the user must first choose the property to be estimated.
Currently, the UNIFAC module allows for the estimation of aqueous solubility and
octanol/water partition coefficients. Several additional properties of interest in environmental
fate modeling are also listed (solubility in water/organic cosolvent systems, solubility in
organic solvents, and oil/water partition coefficients), but only the aqueous solubility and
octanol/water partition coefficient options are currently functional.

After choosing the property, the user must input the required structural information using one
of three methods: manual, connection table (option not currently implemented), or SMILES
string. The manual input method requires the user to dissect the molecule into its appropriate
UNIFAC functional groups. A table of UNIFAC structural groups is provided for this
purpose. If either the SMILES or connection table option is used, the molecule is automatically
dissected into its UNIFAC structural groups. Once the groups are selected, the activity
coefficient(s) can be calculated. The UNIFAC groups are displayed so that the user can verify
their correctness. Figure 10 illustrates the overall operation of the UNIFAC module.

Figure 10. Flow chart depicting operation of UNIFAC module.


Calculation  of  UNIFAC  derived  activity  coefficients 

UNIFAC (UNIQUAC Functional Group Activity Coefficient) is a solution-of-groups
method for calculating activity coefficients which requires only the structure of the compound
as input. The calculated activity coefficients can then be used to estimate aqueous solubility,
octanol/water partition coefficients, and related properties such as solubility in organic
solvents, solubility in mixed water/organic solvents, and oil/water partition coefficients.

In this technique, UNIFAC calculates activity coefficients by dividing them into two parts,
a combinatorial part which reflects the size and shape of the molecules present and a residual
part which depends on the functional group interactions:

$$\ln \gamma_i = \ln \gamma_i^C + \ln \gamma_i^R \tag{10}$$

where $\gamma_i$ is the activity coefficient for the ith molecular component in the mixture. The
superscripts refer to the combinatorial (C) and residual (R) parts.

Various  parameters,  such  as  van  der  Waals  group  volumes  and  surface  areas  and  group 
interaction  parameters,  are  input  into  a  series  of  equations  from  which  the  combinatorial  and 
residual  parts  are  calculated.  Values  for  the  group  parameters  have  been  tabulated  and  can  be 
found  in  the  literature  (Fredenslund  et  al.,  1977  and  Gmehling,  1982).  The  group  parameters 
of  Gmehling  (1982)  are  used  in  the  PEP  UNIFAC  module. 

Estimation of aqueous solubility and octanol/water partition coefficients from UNIFAC-derived
activity coefficients

UNIFAC-derived activity coefficients can be used to calculate aqueous solubility. The
aqueous solubility (S) of an organic liquid can be approximated as follows:

$$S = 55.5\, a_{org} / \gamma_{aq} \tag{11}$$

where $a_{org}$ is the activity of the liquid in the organic phase and $\gamma_{aq}$ is the activity coefficient
in the aqueous phase. For hydrophobic compounds, $a_{org}$ is approximately one, since water does
not significantly dissolve in the organic phase. Thus the aqueous solubility of an organic liquid
can be estimated from the following expression:

$$S = 55.5 / (\gamma_{aq})_{UNIFAC} \tag{12}$$

where $(\gamma_{aq})_{UNIFAC}$ is the UNIFAC-derived activity coefficient at infinite dilution. For
solids, the solubilities must be corrected to those of the corresponding supercooled liquids
using the following expression (Yalkowsky et al., 1980):

$$\log S_{supercooled\ liquid} = \log S_{solid} + 0.01(MP - 25) \tag{13}$$

where MP is the compound's melting point in °C.
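The solubility estimate and the melting point correction can be sketched in Python as follows. The sketch assumes that the 55.5 factor is the molarity of water (so S is in mol/L), that the organic-phase activity is one, and that the melting point correction is applied in the rearranged direction to go from the supercooled liquid estimate to the solid. The function names and the example activity coefficient are hypothetical and are not taken from PEP.

```python
from math import log10

WATER_MOLARITY = 55.5  # mol/L, the constant in the solubility expression


def log_solubility_liquid(gamma_aq):
    """log10 S for a liquid solute from its aqueous activity coefficient."""
    return log10(WATER_MOLARITY / gamma_aq)


def log_solubility_solid(gamma_aq, melting_point_c):
    """log10 S for a solid solute.

    Rearranged melting point correction (an assumed direction of use):
    log S_solid = log S_supercooled_liquid - 0.01 * (MP - 25).
    """
    return log_solubility_liquid(gamma_aq) - 0.01 * (melting_point_c - 25.0)


gamma = 1.0e4  # made-up infinite-dilution activity coefficient
print(round(log_solubility_liquid(gamma), 3))
print(round(log_solubility_solid(gamma, 125.0), 3))
```

For a compound melting at 125 °C the correction lowers log S by one unit, that is, the solid is estimated to be ten times less soluble than its supercooled liquid.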

Property/property correlation module

The property-property correlation module is not fully implemented at the current time.
Only a limited number of property/property correlations have been developed using the data
collected in this study. A prototype of the module has been developed using property-property
correlations published in the literature along with several general relationships developed in
the first year of this project. The property-property module should be fully implemented within
several months. A flow chart describing the operation of this prototype module is presented in
Figure 11.

Figure 11. Flow chart depicting operation of property/property correlation module


Upon entering this module, the user must first choose the property to be estimated. Once
this is done, the program displays the available regression models along with the required input
properties. At this time the user has the option of viewing the regression model statistics.
Once the user chooses the most appropriate regression model and inputs the required
information, the property can be estimated. The results of the property estimation can then be
displayed along with an estimate of its accuracy based on the statistical approach described in
the MCI section.

A summary of the property-property correlations developed in the first year of study is
presented in Appendix B. It is anticipated that the module will be completed in several months.

SUMMARY  OF  FIRST  YEAR  ACCOMPLISHMENTS 

Chemical  property  data  were  compiled  for  over  700  compounds  from  a  variety  of  literature 
sources  and  computerized  databases.  Only  experimentally  measured  properties  were  used. 
Using  this  information,  a  chemical  property  database  was  created  and  used  for  developing 
MCI-property  and  property-property  relationships  which  were  then  incorporated  into  a 
prototype  microcomputer  based  property  estimation  program,  referred  to  as  PEP. 

The property estimation program, PEP, is a decision support system, developed using
HyperCard™ software, which utilizes MCI-property, property-property, and UNIFAC
modules to provide the user with several approaches to estimate physical properties. The
current version of PEP provides estimates of the following properties: aqueous solubility,
octanol/water partition coefficients, vapor pressure, Henry's Law constants, organic carbon
based soil sorption coefficients, and bioconcentration factors.

At the time of this report, the three property estimation modules are partially implemented:
the MCI module is approximately 90% complete, while the UNIFAC and property-property
modules are approximately 50% complete.


SECOND  YEAR  OBJECTIVES 

In the original proposal, the major focus of the second year was to port the program
developed on the Macintosh over to an MS-DOS platform. After reviewing the results obtained
in the first year, the investigators feel that further development and refinement of the
structure-property relationships and linking PEP to an environmental fate model would be more
important than implementing PEP on a different computer platform. Thus, we propose to
accomplish the following tasks in the second year of the project:

1. Complete UNIFAC and property-property modules.

2.  Refine  MCI-property  relationships  by  examining  the  relationship  between  MCI  and 
properties  such  as  total  molecular  surface  area,  polarizability,  and  dipole  moment. 

3.  Continue  to  develop  chemical  property  database  for  current  compounds  and  expand 
database  by  adding  biotic  and  abiotic  transformation  related  properties  such  as 
biodegradation,  hydrolysis,  and  photolysis  rates. 

4.  Investigate  the  relationship  between  MCI  and  transformation  rates. 

5.  Link  PEP  with  VIP  (Vadose  Zone  Interactive  Processes)  to  demonstrate  the  utility  of 
combining  a  property  estimation  program  with  an  environmental  fate  model. 

6. If time permits, attempt to port PEP and associated database over to an MS-DOS platform.

Summary Statistics Cards for MCI-property relationships

STATISTICS   Class: S Universal

Regression Results

Variable    Coef.      Std. Error   t
Constant    0.3917     0.1376       2.85
vp1         -0.9257    0.0316       -29.3
Δvp1        1.8251     0.1047       17.4

Analysis of Variance Table

Source       RSS        df     MSS      F
Regression   889.176    2      445      446
Residual     360.920    362    0.997
Total        1250.096   364    3.4343

r² = 71.1%    n_obs = 365    s = 0.9985

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS   Class: S Universal -ionizable

Regression Results

Variable    Coef.      Std. Error   t
Constant    0.437      0.1379       3.17
vp1         -0.9748    0.0323       -30.2
Δvp1        2.129      0.1282       16.6

Analysis of Variance Table

Source       RSS        df     MSS      F
Regression   793.809    2      397      462
Residual     245.514    286    0.8584
Total        1039.323   288    3.6088

r² = 76.4%    n_obs = 289    s = 0.9265

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: non-polar, non-cyclic

Regression Results

Variable    Coef.       Std. Error   t
Constant    2.0477      0.3149       6.5
np1         -2.0254     0.1515       -13.4
np3         0.9608      0.1625       5.91

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   52.8732      2     26.44     141
Residual     6.00701      32    0.1877
Total        58.88021     34    1.7318

r2 = 89.8%    nobs = 35    S = 0.4333

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: S Oxygenated aliphatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    2.9022      0.074        39.2
np1         -1.1561     0.0178       -65.1

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   196.290      1     196       4240
Residual     3.65699      79    0.0463
Total        199.9469     80    2.4993

r2 = 98.2%    nobs = 81    S = 0.2152

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob.
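Each card defines a linear estimation equation. As a hedged illustration, the S Oxygenated aliphatics card can be read as log S = 2.9022 - 1.1561 (np1), assuming the response variable is log S as the plot title "Log S vs. MCI" suggests. In the sketch below the np1 value of 3.0 is hypothetical and does not correspond to any compound in the database. The plus or minus 2S band is only a rough indication of scatter, since it ignores the uncertainty in the coefficients themselves.

```python
def predict_log_s(np1, intercept=2.9022, slope=-1.1561):
    """Estimate log S from the np1 index using the card coefficients."""
    return intercept + slope * np1


def rough_band(np1, s=0.2152, k=2.0):
    """Return a crude (low, high) band of +/- k*S around the estimate.

    S is the standard error of the estimate printed on the card.
    """
    center = predict_log_s(np1)
    return center - k * s, center + k * s


np1_example = 3.0  # hypothetical index value, for illustration only
low, high = rough_band(np1_example)
print(f"log S estimate: {predict_log_s(np1_example):.4f}")
print(f"rough band:     {low:.4f} to {high:.4f}")
```

Units of S are not stated on this card, so the result should be interpreted only on the scale used to build the correlation.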


STATISTICS  Class: S non-polar, Aromatic

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.2478      0.2122       1.17
vp1         -1.1114     0.0579       -19.2
vp6         0.6617      0.1107       5.98

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   169.662      2     84.8      349
Residual     26.9593      111   0.2429
Total        196.6213     113   1.74

r2 = 86.3%    nobs = 114    S = 0.4928

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: s pcbs

Regression Results

Variable    Coef.       Std. Error   t
Constant    -1.8528     0.2204       -8.41
bp6         -6.2017     0.3654       -17

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   6.50320      1     6.5032    288
Residual     0.158026     7     0.0226
Total        6.661226     8     0.8327

r2 = 97.6%    nobs = 9    S = 0.1503

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: S ureas

Regression Results

Variable    Coef.       Std. Error   t
Constant    3.758       0.4631       8.12
bp1         -1.0747     0.0808       -13.3

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   32.5444      1     32.54     177
Residual     1.65604      9     0.184
Total        34.20044     10    3.42

r2 = 95.2%    nobs = 11    S = 0.4290

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: S Anilines

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.9892      0.401        2.47
bp1         -0.7157     0.0734       -9.75

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   45.5658      1     45.57     95.0
Residual     9.10898      19    0.4794
Total        54.67478     20    2.7337

r2 = 83.3%    nobs = 21    S = 0.6924

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: s Alcohols

Regression Results

Variable    Coef.       Std. Error   t
Constant    2.8524      0.0839       34
bp1         -1.1446     0.0201       -57

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   [illegible]  1     168       3248
Residual     [illegible]  66    0.0517
Total        [illegible]  67    2.5576

r2 = 98.0%    nobs = 68    S = 0.2274

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: S Halogenated benzenes

Regression Results

Variable    Coef.       Std. Error   t
Constant    -0.7166     0.2099       -3.41
nc3         -0.8638     0.3031       -2.85
vp0         -0.3291     0.0405       -8.12

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   15.5443      2     7.772     102
Residual     2.21780      29    0.0765
Total        17.7621      31    0.573

r2 = 87.5%    nobs = 32    S = 0.2765

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: s pahs

Regression Results

Variable    Coef.       Std. Error   t
Constant    -1.1353     0.2894       -3.92
np1         -0.5011     0.0373       -13.4

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   30.5352      1     30.54     181
Residual     5.74826      34    0.1691
Total        36.28346     35    1.0367

r2 = 84.2%    nobs = 36    S = 0.4112

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: S Polar Aromatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    1.4051      0.3001       4.68
vp0         -0.4987     0.0396       -12.6
Δvp1        0.5967      0.2204       2.71

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   174.537      2     87.3      97.7
Residual     96.5006      108   0.8935
Total        271.0376     110   2.464

r2 = 64.4%    nobs = 111    S = 0.9453

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: PvPCBs

Regression Results

Variable    Coef.       Std. Error   t
Constant    9.4791      1.815        5.22
np3         -2.4284     0.4481       -5.42
nc5         5.8143      2.15         2.7

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   49.8285      2     24.91     115
Residual     2.60424      12    0.217
Total        52.43274     14    3.7452

r2 = 95.0%    nobs = 15    S = 0.4659

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Pv Universal

Regression Results

Variable    Coef.       Std. Error   t
Constant    5.2611      0.1801       29.2
np3         -1.2746     0.0491       -25.9

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   453.996      1     454       672
Residual     62.7848      93    0.675
Total        516.7808     94    5.4977

r2 = 87.9%    nobs = 95    S = 0.8216

Plots: Log Pv vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Pv pahs

Regression Results

Variable    Coef.       Std. Error   t
Constant    8.7168      0.4656       18.7
np1         -1.4892     0.0702       -21.2

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   35.5950      1     35.6      450
Residual     0.712338     9     0.0791
Total        36.307331    10    3.6307

r2 = 98.0%    nobs = 11    S = 0.2813

Plots: Log Pv vs. MCI (x-axis: np1); Residual vs. Predicted; Residual vs. Prob. (x-axis: number of standard deviations)

Card buttons: Help; List; Return to PEP; data base


STATISTICS  Class: Pv Halogenated Aromatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    6.622       0.349        19
bp1         -1.5592     0.0701       -22.2

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   147.169      1     147       494
Residual     7.14780      24    0.2978
Total        154.3168     25    6.1727

r2 = 95.4%    nobs = 26    S = 0.5457

Plots: Log Pv vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow Universal

Regression Results

Variable    Coef.       Std. Error   t
Constant    1.1527      0.1411       [illegible]
vp0         0.3978      0.0175       22.7
Δvp0        -1.9843     0.1234       -16.1

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   335.264      2     168       315
Residual     102.809      193   0.5327
Total        438.073      195   2.2465

r2 = 76.5%    nobs = 196    S = 0.7299

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow ureas

Regression Results

Variable    Coef.       Std. Error   t
Constant    -3.9333     0.2288       -17.2
vp1         1.2594      0.0561       22.5

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   [illegible]  1     18.83     504
Residual     [illegible]  5     0.0373
Total        [illegible]  6     3.1694

r2 = 99.0%    nobs = 7    S = 0.1932

Plots: Log Kow vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow Polar Aromatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    1.8347      0.2632       6.97
vp0         0.2869      0.028        10.2
Δvp0        -1.4152     0.2617       -5.41

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   39.0336      2     19.52     52.9
Residual     14.7522      40    0.3688
Total        53.7858      42    1.2806

r2 = 72.6%    nobs = 43    S = 0.6073

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow pcbs

Regression Results

Variable    Coef.       Std. Error   t
Constant    3.554       0.1888       18.8
bp4         -0.9026     0.2567       -3.52
bp6         5.8313      0.6189       9.42

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   9.80627      2     4.9031    167
Residual     0.382424     13    0.0294
Total        10.18869     15    0.6792

r2 = 96.2%    nobs = 16    S = 0.1715

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow pah's

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.7812      0.218        3.58
np0         0.4103      0.0206       19.9

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   10.9031      1     10.90     395
Residual     [illegible]  11    0.0276
Total        11.20669     12    0.9339

r2 = 97.3%    nobs = 13    S = 0.1661

Plots: Log Kow vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow nonpolar aromatic

Regression Results

Variable    Coef.       Std. Error   t
Constant    1.2033      0.1598       7.53
vp0         0.4146      0.0187       22.2

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   102.413      1     102       494
Residual     17.6377      85    0.2075
Total        120.0507     86    1.3959

r2 = 85.3%    nobs = 87    S = 0.4555

Plots: Log Kow vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow noncyclic Oxygenated

Regression Results

Variable    Coef.       Std. Error   t
Constant    -1.6181     0.1489       -10.9
bp1         1.0591      0.0397       26.7

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   28.7287      1     28.73     713
Residual     0.725310     18    0.0403
Total        29.45401     19    1.5502

r2 = 97.5%    nobs = 20    S = 0.2007

Plots: Log Kow vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow noncyclic nonpolar

Regression Results

Variable    Coef.       Std. Error   t
Constant    -0.2407     0.2418       -0.995
bp1         1.2633      0.0842       15

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   37.0139      1     37.01     225
Residual     5.26621      32    0.1646
Total        42.2801      33    1.2812

r2 = 87.5%    nobs = 34    S = 0.4057

Plots: Log Kow vs. MCI; Residual vs. Predicted; Residual vs. Prob.

STATISTICS  Class: Kow anilines

Regression Results

Variable    Coef.       Std. Error   t
Constant    -1.3184     0.2381       -5.54
bp1         1.0828      0.0878       12.3
Δvp5        -3.9803     0.6307       -6.31

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   18.5293      2     9.265     172
Residual     0.646843     12    0.0539
Total        19.17614     14    1.3697

r2 = 96.6%    nobs = 15    S = 0.2322

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Kow Alcohols

Regression Results

Variable    Coef.       Std. Error   t
Constant    -1.9613     0.1281       -15.3
bp1         1.1665      0.0354       32.9

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   5.03456      1     5.0346    1085
Residual     0.013924     3     0.0046
Total        5.048484     4     1.2621

r2 = 99.7%    nobs = 5    S = 0.0681

Plots: Log Kow vs. MCI; Residual vs. Predicted; Residual vs. Prob.
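Because the cards list the total mean square alongside the residual mean square, a degrees-of-freedom adjusted r2 can be obtained from any card as one minus the ratio of the two. The adjusted value is not printed on the cards. The sketch below computes it for the Kow Alcohols card (five observations) under the standard definition, and the function names are hypothetical.

```python
def r2_percent(rss_res, rss_tot):
    """Unadjusted coefficient of determination, in percent."""
    return 100.0 * (1.0 - rss_res / rss_tot)


def adjusted_r2_percent(rss_res, df_res, rss_tot, df_tot):
    """Adjusted r2: one minus residual mean square over total mean square."""
    ms_res = rss_res / df_res
    ms_tot = rss_tot / df_tot
    return 100.0 * (1.0 - ms_res / ms_tot)


# Values transcribed from the Kow Alcohols card.
plain = r2_percent(rss_res=0.013924, rss_tot=5.048484)
adjusted = adjusted_r2_percent(rss_res=0.013924, df_res=3, rss_tot=5.048484, df_tot=4)
print(f"r2 = {plain:.2f}%, adjusted r2 = {adjusted:.2f}%")
```

For this card the two values are close (about 99.7% and 99.6%). The gap widens for small classes with weaker fits, which is one reason to read r2 together with nobs and S.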


STATISTICS  Class: Koc Universal

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.6111      0.1767       3.46
np1         0.4647      0.0289       16.1
Δvp0        -1.2243     0.1176       -10.4

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   159.141      2     79.6      138
Residual     96.6154      168   0.5751
Total        255.7564     170   1.5044

r2 = 62.2%    nobs = 171    S = 0.7583

Plots: Predicted vs. Exp. (x-axis: experimental log Koc); Residual vs. Predicted; Residual vs. Prob. (x-axis: number of standard deviations)


STATISTICS  Class: Koc Ureas 2

Regression Results

Variable    Coef.       Std. Error   t
Constant    -0.4476     0.4502       -0.994
Δvp1        -0.0235     0.4433       -0.053
np0         0.2568      0.0385       6.68

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   7.64836      2     3.8242    24.8
Residual     2.92537      19    0.154
Total        10.57373     21    0.5035

r2 = 73.2%    nobs = 22    S = 0.3924

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc Ureas

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.8342      0.1578       5.29
np6         1.6588      0.148        11.2
nch6        -6.2556     1.515        -4.13

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   9.29029      2     4.6451    68.8
Residual     1.28344      19    0.0675
Total        10.5737      21    0.5035

r2 = 87.9%    nobs = 22    S = 0.2599

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc Triazines

Regression Results

Variable    Coef.       Std. Error   t
Constant    -0.5643     0.7221       -0.781
bp1         0.6296      0.0919       6.85
Δvp0        -0.9382     0.3763       -2.49

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   1.60125      2     0.80062   32.6
Residual     0.221318     9     0.0246
Total        1.8226       11    0.1657

r2 = 87.9%    nobs = 12    S = 0.1568

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc Polar, non-aromatic, non-cyclic

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.0615      0.23         0.267
np0         0.2486      0.0234       10.6
Δvp1        0.3348      0.0762       4.39

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   10.3570      2     5.178     80.2
Residual     0.516495     8     0.0646
Total        10.87349     10    1.0873

r2 = 95.2%    nobs = 11    S = 0.2541

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc PCBs

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.0018      [illegible]  0.001
nc3         3.3677      [illegible]  4.21

Analysis of Variance Table

Source       RSS          df           MSS       F
Regression   1.84147      1            1.8415    17.8
Residual     0.518298     [illegible]  0.1037
Total        2.359768     [illegible]  0.3933

r2 = 78.0%    nobs = 7    S = 0.3220

Plots: Log Koc vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc pahs

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.2187      0.322        0.679
np1         0.5672      0.0405       14

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   24.1463      1     24.15     196
Residual     1.60011      13    0.1231
Total        25.74641     14    1.839

r2 = 93.8%    nobs = 15    S = 0.3508

Plots: Log Koc vs. MCI (x-axis: np1); Residual vs. Predicted; Residual vs. Prob.

Card buttons: Help; data base


STATISTICS  Class: Koc non-polar, non-aromatic, non-

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.3614      0.3952       [illegible]
bp0         0.3604      0.0944       3.82

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   0.815111     1     0.8151    14.6
Residual     0.559114     10    0.0559
Total        1.374225     11    0.1249

r2 = 59.3%    nobs = 12    S = 0.2365

Plots: Log Koc vs. MCI (x-axis: bp0); Residual vs. Predicted; Residual vs. Prob. (x-axis: number of standard deviations)


STATISTICS  Class: Koc non-polar, Aromatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.3889      0.1777       2.19
np1         0.5983      0.0273       21.9
bch6        -3.4659     0.484        -7.16

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   60.5243      2     30.26     240
Residual     4.54683      36    0.1263
Total        65.07113     38    1.7124

r2 = 93.0%    nobs = 39    S = 0.3554

Plots: Predicted vs. Exp. (x-axis: experimental log Koc); Residual vs. Predicted; Residual vs. Prob. (x-axis: number of standard deviations)


STATISTICS  Class: Koc Halogenated Aromatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    -0.2045     0.5708       -0.358
np1         0.7411      0.1318       5.62

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   2.03980      1     2.0398    31.6
Residual     0.515841     8     0.0645
Total        2.555641     9     0.284

r2 = 79.8%    nobs = 10    S = 0.2539

Plots: Log Koc vs. MCI (x-axis: np1); Residual vs. Predicted; Residual vs. Prob. (x-axis: number of standard deviations)

Card buttons: Return to PEP


STATISTICS  Class: Koc Carbamates

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.1814      0.3295       0.551
vp0         0.2523      0.0363       6.96

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   2.63502      1     2.635     48.4
Residual     0.598675     11    0.0544
Total        3.233695     12    0.2695

r2 = 81.5%    nobs = 13    S = 0.2333

Plots: Log Koc vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc Anilines

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.1424      0.3346       0.425
np1         0.5734      0.0776       7.39
Δvp0        -1.2366     0.3295       -3.75

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   9.94209      2     4.971     45.6
Residual     0.763154     7     0.109
Total        [illegible]  9     1.1895

r2 = 92.9%    nobs = 10    S = 0.3302

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc Acetanilide

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.3745      0.3781       0.99
Δvp1        -1.2323     0.5152       -2.39
np1         0.3512      0.0742       4.74

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   1.58075      2     0.7904    11.8
Residual     0.403537     6     0.0673
Total        1.984287     8     0.248

r2 = 79.7%    nobs = 9    S = 0.2593

Plots: Predicted vs. Exp. (x-axis: experimental log Koc); Residual vs. Predicted; Residual vs. Prob. (x-axis: number of standard deviations)


STATISTICS  Class: H Universal

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.0003      0.2096       [illegible]
np1         -0.2496     0.0422       [illegible]
Δvp1        -1.667      0.2082       [illegible]

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   93.9802      2     46.99     74.7
Residual     45.9296      73    0.6292
Total        139.9098     75    1.8655

r2 = 67.2%    nobs = 76    S = 0.7932

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: H nonpolar Aromatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    -0.9387     0.1182       -7.94
np6         -1.5775     0.1463       -10.8
bp4         1.5632      0.209        7.48
vp3         -0.3061     0.1107       -2.76

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   17.3912      3     5.797     79.7
Residual     2.47279      34    0.0727
Total        19.86399     37    0.5369

r2 = 87.6%    nobs = 38    S = 0.2697

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: H Halogenated Aliphatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.7325      0.2612       2.8
vp1         -1.2237     0.1559       -7.85
np2         0.6848      0.0961       7.13

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   6.39442      2     3.1972    33.9
Residual     1.60216      17    0.0942
Total        7.99658      19    0.4209

r2 = 80.0%    nobs = 20    S = 0.3070

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: hpcbs

Regression Results

Variable    Coef.       Std. Error   t
Constant    -2.477      0.099        -25
npc6        0.3676      0.0247       14.9
vpc4        -1.1555     0.0895       -12.9

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   3.26115      2     1.6306    111
Residual     0.175504     12    0.0146
Total        3.436654     14    0.2455

r2 = 94.9%    nobs = 15    S = 0.1209

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: h pahs

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.8541      0.2958       2.89
np1         -0.5197     0.0495       -10.5

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   0.687162     1     0.6872    110
Residual     0.024924     4     0.0062
Total        0.712086     5     0.1424

r2 = 96.5%    nobs = 6    S = 0.0789

Plots: Log H vs. MCI; Residual vs. Predicted; Residual vs. Prob.

STATISTICS  Class: BCF Universal

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.9816      0.2329       4.21
np1         0.4347      0.0415       10.5
Δvp0        -1.9999     0.2886       -6.93

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   60.1143      2     30.06     61.6
Residual     32.7178      67    0.4883
Total        92.8321      69    1.3454

r2 = 64.8%    nobs = 70    S = 0.6988

Plots: Predicted vs. Exp. (x-axis: experimental log BCF)


STATISTICS  Class: bcfpcbs

Regression Results

Variable    Coef.       Std. Error   t
Constant    7.5756      0.6252       12.1
nch6        -21.162     4.009        -5.28

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   0.641375     1     0.6414    27.9
Residual     0.138126     6     0.023
Total        0.779501     7     0.1114

r2 = 82.3%    nobs = 8    S = 0.1517

Plots: Log BCF vs. MCI; Residual vs. Predicted; Residual vs. Prob.

STATISTICS  Class: BCF Polar Aromatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    1.8367      0.5026       3.65
Δvp0        -1.3739     0.4012       -3.42
vp0         0.1889      0.0489       3.87

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   4.68185      2     2.3409    11.5
Residual     2.65010      13    0.2039
Total        7.33195      15    0.4888

r2 = 63.9%    nobs = 16    S = 0.4515

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: BCF Halogenated Aliphatics

Regression Results

Variable    Coef.       Std. Error   t
Constant    -1.7216     0.3819       -4.51
np0         0.632       0.0707       8.93

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   8.51622      1     8.5162    79.8
Residual     0.853741     8     0.1067
Total        9.369961     9     1.0411

r2 = 90.9%    nobs = 10    S = 0.3267

Plots: Log BCF vs. MCI; Residual vs. Predicted; Residual vs. Prob.


APPENDIX  B 


STATISTICS  Class: S Universal from Koc

Regression Results

Variable    Coef.       Std. Error   t
Constant    0.321341    0.2966       1.08
log Koc     -1.27402    0.0915       -13.9

Results in units of mol/L

Analysis of Variance Table

Source       RSS          df    MSS       F
Regression   152.404      1     152       194
Residual     51.8595      66    0.78575

r2 = 74.6%    nobs = 68    S = 0.8804

Plots: Log S vs. Log Koc (x-axis: log Koc); Residual vs. Predicted; Residual vs. Prob. (x-axis: number of standard deviations)

Card buttons: data base
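The property-property cards in this appendix estimate one property from another rather than from connectivity indices. As a hedged sketch, the S Universal from Koc card can be read as log S = 0.321341 - 1.27402 (log Koc), with S in mol/L as noted on the card. The function name and the log Koc value of 2.0 below are hypothetical and serve only to illustrate how the equation is applied.

```python
def estimate_log_s_from_log_koc(log_koc, intercept=0.321341, slope=-1.27402):
    """Estimate log S (S in mol/L) from log Koc using the card coefficients."""
    return intercept + slope * log_koc


log_koc_example = 2.0  # hypothetical input, for illustration only
log_s = estimate_log_s_from_log_koc(log_koc_example)
s_mol_per_l = 10.0 ** log_s  # back-transform from log10 to mol/L

print(f"log S = {log_s:.4f}")
print(f"S     = {s_mol_per_l:.5f} mol/L")
```

With a standard error of the estimate near 0.88 log units on this card, an estimate of this kind is best treated as an order-of-magnitude value.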


STATISTICS  Class: S Universal from Kow


Regression  Results 


Analysis  of  Variance  Table 


std. 


Variable 

Coef. 

Error